Does temporal discounting explain unhealthy behavior? A systematic review and reinforcement learning perspective
Abstract
The tendency to make unhealthy choices is hypothesized to be related to an individual's temporal discount rate, the theoretical rate at which they devalue delayed rewards. Furthermore, a particular form of temporal discounting, hyperbolic discounting, has been proposed to explain why unhealthy behavior can occur despite healthy intentions. We examine these two hypotheses in turn. First, we systematically review studies that investigate whether discount rates can predict unhealthy behavior. These studies reveal that high discount rates for money (and in some instances food or drug rewards) are associated with several unhealthy behaviors and markers of health status, establishing discounting as a promising predictive measure. Second, we examine whether intention-incongruent unhealthy actions are consistent with hyperbolic discounting. We conclude that intention-incongruent actions are often triggered by environmental cues or changes in motivational state, whose effects are not parameterized by hyperbolic discounting. We propose a framework for understanding these state-based effects in terms of the interplay of two distinct reinforcement learning mechanisms: a "model-based" (or goal-directed) system and a "model-free" (or habitual) system. Under this framework, discounting of delayed health may contribute to the initiation of unhealthy behavior, but with repetition many unhealthy behaviors become habitual; if health goals then change, habitual behavior can still arise in response to environmental cues. We propose that the burgeoning development of computational models of these processes will permit further identification of health decision-making phenotypes.
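The link the abstract draws between hyperbolic discounting and acting against one's intentions is usually illustrated by a preference reversal. The sketch below is a minimal illustration, not taken from the paper: it assumes the standard textbook forms V = A / (1 + kD) for hyperbolic and V = A * exp(-kD) for exponential discounting, and the amounts, delays, discount rate `k`, and function names are all hypothetical values chosen for the example.

```python
import math


def hyperbolic_value(amount, delay, k):
    """Discounted value under the hyperbolic form V = A / (1 + k * D)."""
    return amount / (1.0 + k * delay)


def exponential_value(amount, delay, k):
    """Discounted value under the exponential form V = A * exp(-k * D)."""
    return amount * math.exp(-k * delay)


def prefers_larger_later(value_fn, k, lead_time,
                         smaller_sooner=(50.0, 0.0),
                         larger_later=(100.0, 10.0)):
    """Return True if the larger-later reward has the higher discounted value.

    Each option is an (amount, delay) pair; lead_time is added to both delays,
    so it measures how far in advance the choice is being evaluated.
    All numbers are illustrative assumptions.
    """
    ss_amount, ss_delay = smaller_sooner
    ll_amount, ll_delay = larger_later
    v_ss = value_fn(ss_amount, lead_time + ss_delay, k)
    v_ll = value_fn(ll_amount, lead_time + ll_delay, k)
    return v_ll > v_ss


if __name__ == "__main__":
    k = 0.2  # hypothetical discount rate per unit of delay
    for lead_time in (30.0, 0.0):
        print(
            f"lead time {lead_time:>4}: "
            f"hyperbolic prefers larger-later = "
            f"{prefers_larger_later(hyperbolic_value, k, lead_time)}, "
            f"exponential prefers larger-later = "
            f"{prefers_larger_later(exponential_value, k, lead_time)}"
        )
```

With these assumed numbers, the hyperbolic discounter prefers the larger-later reward when both options are far away but switches to the smaller-sooner reward once it becomes immediately available. The exponential discounter's ranking does not depend on the lead time. The cue- and state-driven effects that the abstract says are not parameterized by hyperbolic discounting are outside the scope of this sketch.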
Similar resources
Homeostatic reinforcement learning for integrating reward collection and physiological stability
Efficient regulation of internal homeostasis and defending it against perturbations requires adaptive behavioral strategies. However, the computational principles mediating the interaction between homeostatic and associative learning processes remain undefined. Here we use a definition of primary rewards, as outcomes fulfilling physiological needs, to build a normative theory showing how learni...
Temporal-Difference Reinforcement Learning with Distributed Representations
Temporal-difference (TD) algorithms have been proposed as models of reinforcement learning (RL). We examine two issues of distributed representation in these TD algorithms: distributed representations of belief and distributed discounting factors. Distributed representation of belief allows the believed state of the world to distribute across sets of equivalent states. Distributed exponential d...
Incidental rewarding cues influence economic decisions in people with obesity
Recent research suggests that obesity is linked to prominent alterations in learning and decision-making. This general difference may also underlie the preference for immediately consumable, highly palatable but unhealthy and high-calorie foods. Such poor food-related inter-temporal decision-making can explain weight gain; however, it is not yet clear whether this deficit can be generalized to ...
Unhealthy diets, obesity and time discounting: a systematic literature review and network analysis
There is an increasing policy commitment to address the avoidable burdens of unhealthy diet, overweight and obesity. However, to design effective policies, it is important to understand why people make unhealthy dietary choices. Research from behavioural economics suggests a critical role for time discounting, which describes how people's value of a reward, such as better health, decreases with...
A Reinforcement Learning Method for Maximizing Undiscounted Rewards
While most Reinforcement Learning work utilizes temporal discounting to evaluate performance, the reasons for this are unclear. Is it out of desire or necessity? We argue that it is not out of desire, and seek to dispel the notion that temporal discounting is necessary by proposing a framework for undiscounted optimization. We present a metric of undiscounted performance and an algorithm for fi...